    Evolutionary Algorithms for Reinforcement Learning

    There are two distinct approaches to solving reinforcement learning problems, namely, searching in value function space and searching in policy space. Temporal difference methods and evolutionary algorithms are well-known examples of these approaches. Kaelbling, Littman and Moore recently provided an informative survey of temporal difference methods. This article focuses on the application of evolutionary algorithms to the reinforcement learning problem, emphasizing alternative policy representations, credit assignment methods, and problem-specific genetic operators. Strengths and weaknesses of the evolutionary approach to reinforcement learning are presented, along with a survey of representative applications

    Sheffield University CLEF 2000 submission - bilingual track: German to English

    We investigated dictionary-based cross-language information retrieval using lexical triangulation. Lexical triangulation combines the results of different transitive translations. Transitive translation uses a pivot language to translate between two languages when no direct translation resource is available. We took German queries and translated them via Spanish or Dutch into English. We compared the results of retrieval experiments using these queries with other versions created by combining the transitive translations or created by direct translation. Direct dictionary translation of a query introduces considerable ambiguity that damages retrieval, with an average precision 79% below monolingual in this research. Transitive translation introduces more ambiguity, giving results worse than 88% below direct translation. We have shown that lexical triangulation between two transitive translations can eliminate much of the additional ambiguity introduced by transitive translation
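As a rough illustration of the idea in this abstract, the sketch below translates a German word into English through two pivot languages and keeps only the candidates that both routes agree on. The abstract says only that triangulation "combines" the transitive translations; taking the intersection (with a fallback to the union when nothing is shared) is an assumption made here, and the tiny lexicons and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Lexicon = Dict[str, Set[str]]

# Hypothetical toy lexicons, invented for illustration only.
DE_ES: Lexicon = {"Fisch": {"pez"}}
ES_EN: Lexicon = {"pez": {"fish", "pitch"}}  # "pez" is ambiguous in Spanish
DE_NL: Lexicon = {"Fisch": {"vis"}}
NL_EN: Lexicon = {"vis": {"fish"}}


def transitive(word: str, src_to_pivot: Lexicon, pivot_to_tgt: Lexicon) -> Set[str]:
    """Translate word into the target language through one pivot language."""
    targets: Set[str] = set()
    for pivot_word in src_to_pivot.get(word, set()):
        targets |= pivot_to_tgt.get(pivot_word, set())
    return targets


def triangulate(word: str, routes: List[Tuple[Lexicon, Lexicon]]) -> Set[str]:
    """Combine several transitive translations of the same word.

    Assumption: keep the candidates shared by every route that produced
    anything; if no candidate is shared, fall back to the union.
    """
    candidate_sets = [transitive(word, a, b) for a, b in routes]
    candidate_sets = [s for s in candidate_sets if s]
    if not candidate_sets:
        return set()
    common = set.intersection(*candidate_sets)
    return common if common else set.union(*candidate_sets)


ROUTES = [(DE_ES, ES_EN), (DE_NL, NL_EN)]
via_spanish = transitive("Fisch", DE_ES, ES_EN)   # ambiguous: two candidates
triangulated = triangulate("Fisch", ROUTES)       # only the shared candidate
```

In this toy case the Spanish route alone yields two English candidates, while combining it with the Dutch route leaves only the one both routes share, which is the kind of ambiguity reduction the abstract describes.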

    Inheritance-Based Diversity Measures for Explicit Convergence Control in Evolutionary Algorithms

    Diversity is an important factor in evolutionary algorithms to prevent premature convergence towards a single local optimum. Various means of maintaining diversity throughout the process of evolution exist in the literature. We analyze approaches to diversity that (a) have an explicit and quantifiable influence on fitness at the individual level and (b) require no (or very little) additional domain knowledge such as domain-specific distance functions. We also introduce the concept of genealogical diversity in a broader study. We show that employing these approaches can help evolutionary algorithms for global optimization in many cases. Comment: GECCO '18: Genetic and Evolutionary Computation Conference, 2018, Kyoto, Japa

    Genetic algorithms with elitism-based immigrants for changing optimization problems

    Copyright @ Springer-Verlag Berlin Heidelberg 2007. Addressing dynamic optimization problems has been a challenging task for the genetic algorithm community. Over the years, several approaches have been developed into genetic algorithms to enhance their performance in dynamic environments. One major approach is to maintain the diversity of the population, e.g., via random immigrants. This paper proposes an elitism-based immigrants scheme for genetic algorithms in dynamic environments. In the scheme, the elite from the previous generation is used as the base to create immigrants via mutation to replace the worst individuals in the current population. This way, the introduced immigrants are more adapted to the changing environment. This paper also proposes a hybrid scheme that combines the elitism-based immigrants scheme with the traditional random immigrants scheme to deal with significant changes. The experimental results show that the proposed elitism-based and hybrid immigrants schemes efficiently improve the performance of genetic algorithms in dynamic environments
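The replacement step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the bit-string encoding, the bit-flip mutation, the parameter names (`ratio`, `p_mut`, `random_share`) and their defaults are all assumptions. Setting `random_share` above zero gives one possible reading of the hybrid scheme, where part of the immigrants are generated at random.

```python
import random
from typing import Callable, List, Optional

Individual = List[int]  # assumed encoding: fixed-length bit string


def mutate(ind: Individual, p_mut: float, rng: random.Random) -> Individual:
    """Flip each bit independently with probability p_mut."""
    return [1 - g if rng.random() < p_mut else g for g in ind]


def inject_immigrants(
    population: List[Individual],
    prev_elite: Individual,
    fitness: Callable[[Individual], float],
    ratio: float = 0.2,          # assumed fraction of the population replaced
    p_mut: float = 0.01,         # assumed per-bit mutation rate
    random_share: float = 0.0,   # assumed fraction of immigrants made at random
    rng: Optional[random.Random] = None,
) -> List[Individual]:
    """Replace the worst individuals with immigrants derived from the elite."""
    rng = rng or random.Random()
    n_imm = int(ratio * len(population))
    if n_imm == 0:
        return list(population)
    # Keep the best individuals; the worst n_imm are dropped.
    ranked = sorted(population, key=fitness, reverse=True)
    survivors = ranked[: len(ranked) - n_imm]
    n_random = int(random_share * n_imm)
    # Elitism-based immigrants: mutated copies of the previous elite.
    immigrants = [mutate(prev_elite, p_mut, rng) for _ in range(n_imm - n_random)]
    # Random immigrants (hybrid variant only): fresh random bit strings.
    immigrants += [[rng.randint(0, 1) for _ in prev_elite] for _ in range(n_random)]
    return survivors + immigrants
```

A caller would run this once per generation, after evaluating the current population and before selection, passing in the best individual saved from the previous generation.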

    Multiple cyclotron line-forming regions in GX 301-2

    We present two observations of the high-mass X-ray binary GX 301-2 with NuSTAR, taken at different orbital phases and different luminosities. We find that the continuum is well described by typical phenomenological models, like a very strongly absorbed NPEX model. However, for a statistically acceptable description of the hard X-ray spectrum we require two cyclotron resonant scattering features (CRSF), one at ~35 keV and the other at ~50 keV. Even though both features strongly overlap, the good resolution and sensitivity of NuSTAR allow us to disentangle them at >=99.9% significance. This is the first time that two CRSFs are seen in GX 301-2. We find that the CRSFs are very likely independently formed, as their energies are not harmonically related and, if it were a single line, the deviation from a Gaussian shape would be very large. We compare our results to archival Suzaku data and find that our model also provides a good fit to those data. We study the behavior of the continuum as well as the CRSF parameters as a function of pulse phase in seven phase bins. We find that the energy of the 35 keV CRSF varies smoothly as a function of phase, between 30-38 keV. To explain this variation, we apply a simple model of the accretion column, taking the altitude of the line-forming region, the velocity of the in-falling material, and the resulting relativistic effects into account. We find that in this model the observed energy variation can be explained simply by a variation of the projected velocity and beaming factor of the line-forming region towards us. Comment: 18 pages, 10 figures, accepted for publication in A&

    Evidence for a Variable Ultrafast Outflow in the Newly Discovered Ultraluminous Pulsar NGC 300 ULX-1

    Ultraluminous pulsars are definite proof that persistent super-Eddington accretion occurs in nature. They support the scenario according to which most Ultraluminous X-ray Sources (ULXs) are super-Eddington accretors of stellar mass rather than sub-Eddington intermediate mass black holes. An important prediction of theories of supercritical accretion is the existence of powerful outflows of moderately ionized gas at mildly relativistic speeds. In practice, the spectral resolution of X-ray gratings such as RGS onboard XMM-Newton is required to resolve their observational signatures in ULXs. Using RGS, outflows have been discovered in the spectra of 3 ULXs (none of which are currently known to be pulsars). Most recently, the fourth ultraluminous pulsar was discovered in NGC 300. Here we report detection of an ultrafast outflow (UFO) in the X-ray spectrum of the object, with a significance of more than 3σ, during one of the two simultaneous observations of the source by XMM-Newton and NuSTAR in December 2016. The outflow has a projected velocity of 65000 km/s (0.22c) and a high ionisation factor with a log value of 3.9. This is the first direct evidence for a UFO in a neutron star ULX and also the first time that its evidence in a ULX spectrum is seen in both soft and hard X-ray data simultaneously. We find no evidence of the UFO during the other observation of the object, which could be explained by either a clumpy nature of the absorber or a slight change in our viewing angle of the accretion flow. Comment: 10 pages, 4 figures. Accepted to MNRA
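The quoted conversion from 65000 km/s to 0.22c can be checked with a few lines of arithmetic. The sketch below also computes the relativistic Doppler factor by which a line would be blueshifted if the gas moved directly towards the observer; that purely line-of-sight geometry is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s (exact by definition)


def beta_from_velocity(v_km_s: float) -> float:
    """Velocity expressed as a fraction of the speed of light."""
    return v_km_s / C_KM_S


def blueshift_factor(beta: float) -> float:
    """Relativistic Doppler factor E_obs / E_rest for gas approaching
    the observer head-on (assumed geometry)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta))


beta = beta_from_velocity(65_000.0)   # about 0.217, which rounds to 0.22
factor = blueshift_factor(beta)       # energy shift of an absorption line
```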

    A MOS-based Dynamic Memetic Differential Evolution Algorithm for Continuous Optimization: A Scalability Test

    Continuous optimization is one of the most active areas in the field of heuristic optimization. Many algorithms have been proposed and compared on several benchmarks of functions, with different performance depending on the problems. For this reason, the combination of different search strategies seems desirable to obtain the best performance of each of these approaches. This contribution explores the use of a hybrid memetic algorithm based on the multiple offspring framework. The proposed algorithm combines the explorative/exploitative strength of two heuristic search methods that separately obtain very competitive results. This algorithm has been tested with the benchmark problems and conditions defined for the special issue of the Soft Computing Journal on Scalability of Evolutionary Algorithms and other Metaheuristics for Large Scale Continuous Optimization Problems. The proposed algorithm obtained the best results compared with both its composing algorithms and a set of reference algorithms that were proposed for the special issue

    NuSTAR hard X-ray observation of a sub-A class solar flare

    We report a NuSTAR observation of a solar microflare, SOL2015-09-01T04. Although it was too faint to be observed by the GOES X-ray Sensor, we estimate the event to be an A0.1 class flare in brightness. This microflare, with only 5 counts per second per detector observed by RHESSI, is fainter than any hard X-ray (HXR) flare in the existing literature. The microflare occurred during a solar pointing by the highly sensitive NuSTAR astrophysical observatory, which used its direct focusing optics to produce detailed HXR microflare spectra and images. The microflare exhibits HXR properties commonly observed in larger flares, including a fast rise and more gradual decay, earlier peak time with higher energy, spatial dimensions similar to the RHESSI microflares, and a high-energy excess beyond an isothermal spectral component during the impulsive phase. The microflare is small in emission measure, temperature, and energy, though not in physical size; observations are consistent with an origin via the interaction of at least two magnetic loops. We estimate the increase in thermal energy at the time of the microflare to be 2.4x10^27 ergs. The observation suggests that flares do indeed scale down to extremely small energies and retain what we customarily think of as "flarelike" properties. Comment: Status: Accepted by the Astrophysical Journal, 2017 July 1

    Improving Policy Learning via Language Dynamics Distillation

    Recent work has shown that augmenting environments with language descriptions improves policy learning. However, for environments with complex language abstractions, learning how to ground language to observations is difficult due to sparse, delayed rewards. We propose Language Dynamics Distillation (LDD), which pretrains a model to predict environment dynamics given demonstrations with language descriptions, and then fine-tunes these language-aware pretrained representations via reinforcement learning (RL). In this way, the model is trained to both maximize expected reward and retain knowledge about how language relates to environment dynamics. On SILG, a benchmark of five tasks with language descriptions that evaluate distinct generalization challenges on unseen environments (NetHack, ALFWorld, RTFM, Messenger, and Touchdown), LDD outperforms tabula-rasa RL, VAE pretraining, and methods that learn from unlabeled demonstrations in inverse RL and reward shaping with pretrained experts. In our analyses, we show that language descriptions in demonstrations improve sample-efficiency and generalization across environments, and that dynamics modeling with expert demonstrations is more effective than with non-experts

    Canalization and Symmetry in Boolean Models for Genetic Regulatory Networks

    Canalization of genetic regulatory networks has been argued to be favored by evolutionary processes due to the stability that it can confer to phenotype expression. We explore whether a significant amount of canalization and partial canalization can arise in purely random networks in the absence of evolutionary pressures. We use a mapping of the Boolean functions in the Kauffman N-K model for genetic regulatory networks onto a k-dimensional Ising hypercube to show that the functions can be divided into different classes strictly due to geometrical constraints. The classes can be counted and their properties determined using results from group theory and isomer chemistry. We demonstrate that partially canalized functions completely dominate all possible Boolean functions, particularly for higher k. This indicates that partial canalization is extremely common, even in randomly chosen networks, and has implications for how much information can be obtained in experiments on native state genetic regulatory networks. Comment: 14 pages, 4 figures; version to appear in J. Phys.